Statistical methods for epidemiology

Types of study

What is epidemiology

  • Epidemiology is the study of the association between exposures (causes, risk factors) and disease outcomes in specific human populations, and the application of this study to control health problems.

Types of research studies

  • Sample survey

    • Health interview surveys
  • Epidemiological studies

    • Observational, not experimental

    • Case-control studies: association between current disease and past exposure

      • Example: a group of subjects who had lung cancer and a group who did not were included in the study, and both groups were interviewed about their past smoking habits.
    • Prospective cohort studies: association between current exposure and subsequent development of disease.

      • Example: invite a group of subjects to complete a short questionnaire about their smoking habits, then follow up all respondents to see who develops the disease.
  • Clinical trials

    • Experimental

Measures of Disease Occurrence

Two measures

  • Prevalence of disease
    • the proportion of subjects in the population who have the disease
  • Incidence of disease
    • the rate at which new disease cases arise

Point prevalence, incidence proportion and incidence rate

  • Point prevalence

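    $$\text{Point prevalence at time } t = \frac{\text{number of existing cases at time } t}{\text{number of subjects in the population at time } t}$$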

  • Incidence proportion

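    $$\text{Incidence proportion over } [t_0, t_1] = \frac{\text{number of new cases arising in } [t_0, t_1]}{\text{number of subjects at risk at } t_0}$$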

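  • Incidence rate

    $$\text{Incidence rate over } [t_0, t_1] = \frac{\text{number of new cases arising in } [t_0, t_1]}{\text{total person-time at risk during } [t_0, t_1]}$$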

  • Difference between incidence proportion and incidence rate:

    The definition of an incidence proportion assumes a closed population, that is, no new subjects at risk are allowed to enter during the interval. This restriction is relaxed when using an incidence rate, because each subject contributes only the time they are actually at risk.
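
As a minimal sketch of how the three measures differ, the snippet below computes them for a small cohort. The `Subject` record, its field names, and all the numbers are hypothetical and chosen only for illustration. The formulas are the standard definitions: existing cases over population size, new cases over subjects at risk at the start, and new cases over person-time at risk.

```python
from dataclasses import dataclass


@dataclass
class Subject:
    diseased_at_start: bool  # already a case when follow-up begins
    new_case: bool           # developed the disease during follow-up
    years_at_risk: float     # person-time contributed while at risk


# Hypothetical closed cohort followed for up to 2 years (illustrative numbers).
cohort = [
    Subject(True, False, 0.0),   # prevalent case: never at risk of a new onset
    Subject(False, True, 0.5),   # new case after 0.5 years
    Subject(False, True, 1.5),   # new case after 1.5 years
    Subject(False, False, 2.0),  # disease-free for the whole follow-up
    Subject(False, False, 2.0),
]


def point_prevalence(subjects):
    """Existing cases at the start divided by the population size at the start."""
    return sum(s.diseased_at_start for s in subjects) / len(subjects)


def incidence_proportion(subjects):
    """New cases divided by the number of subjects at risk at the start."""
    at_risk = [s for s in subjects if not s.diseased_at_start]
    return sum(s.new_case for s in at_risk) / len(at_risk)


def incidence_rate(subjects):
    """New cases divided by the total person-time at risk."""
    at_risk = [s for s in subjects if not s.diseased_at_start]
    person_time = sum(s.years_at_risk for s in at_risk)
    return sum(s.new_case for s in at_risk) / person_time


print(point_prevalence(cohort))
print(incidence_proportion(cohort))
print(incidence_rate(cohort))
```

With these made-up numbers the point prevalence is 1/5, the incidence proportion is 2/4, and the incidence rate is 2 cases per 6 person-years. Only the last one has units of "per time", which is why it still makes sense when subjects enter or leave during follow-up.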

Instantaneous Rate $\lambda(t)$

  • Why do we need this measure?

    The rate of disease development can change substantially within a time interval. To capture this dynamic pattern, we need to consider the incidence rate over a very short interval.

  • It is also called the force of mortality in actuarial science and the hazard rate in survival analysis.

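  • Let $T$ be the time at which a subject acquires the disease, and let $P(t)$ be the probability that the subject acquires the disease prior to time $t$. $\lambda(t)$ is a conditional probability per unit time: the probability of acquiring the disease in a short interval after $t$, given that the subject is still disease-free at $t$, divided by the length of the interval:

    $$\lambda(t) = \lim_{\Delta t \to 0} \frac{\Pr(t \le T < t + \Delta t \mid T \ge t)}{\Delta t} = \frac{P'(t)}{1 - P(t)}$$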

  • The cumulative hazard rate is defined as:

    $$\Lambda(t) = \int_0^t \lambda(s)\,ds$$

  • From $\lambda(t) = -\frac{d}{dt}\log(1-P(t))$ and $P(0) = 0$, we get

    $$P(t) = 1 - \exp\left(-\int_0^t \lambda(s)\,ds\right)$$

  • Important special case:

    • $\lambda(t) = \lambda$ is a constant, so $P(t) = 1 - e^{-\lambda t}$.

      This is the exponential distribution.

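    • Suppose we have a group of subjects $i = 1, 2, \dots, n$ whose death times $T_i$ are all known. How can we estimate $\lambda$?

      The likelihood is the product of the exponential densities:

      $$L(\lambda) = \prod_{i=1}^{n} \lambda e^{-\lambda T_i}$$

    Maximizing the likelihood (MLE), we get

      $$\hat{\lambda} = \frac{n}{\sum_{i=1}^{n} T_i}$$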

    - If we do not observe every subject fail, then the likelihood contribution of a subject who has not failed is simply the probability of reaching time $T_i$ without failing, which is $1 - P(T_i)$.
    - ![image-20200917135506280](https://i.imgur.com/2I6Zaiv.png)
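
The constant-hazard estimate can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reference implementation: the function names and the data are hypothetical. It relies on the standard result for the exponential model, where a subject who fails at $T_i$ contributes $\lambda e^{-\lambda T_i}$ and a subject still failure-free at $T_i$ contributes $1 - P(T_i) = e^{-\lambda T_i}$. The log-likelihood is then $d \log\lambda - \lambda \sum_i T_i$, with $d$ the number of observed failures, and it is maximized at $\hat{\lambda} = d / \sum_i T_i$.

```python
import math


def exponential_mle(times, failed):
    """MLE of a constant hazard: observed failures / total follow-up time.

    times[i]  -- follow-up time T_i of subject i
    failed[i] -- True if subject i failed at T_i, False if still failure-free
    """
    if len(times) != len(failed):
        raise ValueError("times and failed must have the same length")
    total_time = sum(times)
    if total_time <= 0:
        raise ValueError("total follow-up time must be positive")
    return sum(failed) / total_time


def log_likelihood(lam, times, failed):
    """Exponential log-likelihood with failure-free (censored) subjects."""
    # A failure contributes log(lam) - lam * t; a censored subject only -lam * t.
    return sum((math.log(lam) if f else 0.0) - lam * t
               for t, f in zip(times, failed))


# Hypothetical follow-up data: three observed failures, two censored subjects.
times = [2.0, 3.5, 1.0, 4.0, 5.0]
failed = [True, True, True, False, False]

lam_hat = exponential_mle(times, failed)
print(lam_hat)  # 3 failures over 15.5 units of follow-up time
```

When every subject is observed to fail, `sum(failed)` equals $n$ and the estimate reduces to $n / \sum_i T_i$, the uncensored case above.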
    

Review of basic statistical theory

Details can be found in my other blog post: statistics theory.

Measures of Disease-Exposure Association

Notation

  • D: disease occurs
  • $\bar{D}$: disease does not occur
  • E: exposed to a certain risk factor
  • $\bar{E}$: not exposed to a certain risk factor
  • $P(D \mid E)$ and $P(D \mid \bar{E})$: the probability of disease among the exposed and among the unexposed, respectively

Excess Risk

  • Excess risk (ER) for disease D associated with exposure E is defined as

    $$ER = P(D \mid E) - P(D \mid \bar{E})$$

  • ER is also called risk difference

  • ER = 0: no association

  • ER>0: exposure increases the risk of disease

  • ER<0: exposure decreases the risk of disease

Relative Risk

  • Relative risk (RR) for disease D associated with exposure E is defined as

    $$RR = \frac{P(D \mid E)}{P(D \mid \bar{E})}$$

  • RR = 1: no association; RR>1: exposure increases the risk of disease; RR<1: exposure decreases the risk of disease

Odds Ratio

  • The odds of disease D are defined to be:

    $$\text{Odds}(D) = \frac{P(D)}{1 - P(D)} = \frac{P(D)}{P(\bar{D})}$$

  • Odds ratio for disease D associated with exposure E is defined as:

    $$OR = \frac{P(D \mid E) / (1 - P(D \mid E))}{P(D \mid \bar{E}) / (1 - P(D \mid \bar{E}))}$$

  • OR = 1: no association; OR>1: exposure increases the odds of disease; OR<1: exposure decreases the odds of disease

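  • OR is symmetric in the roles of D and E, so the odds ratio for exposure given disease status is the same number:

    $$OR = \frac{P(E \mid D) / (1 - P(E \mid D))}{P(E \mid \bar{D}) / (1 - P(E \mid \bar{D}))}$$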
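
To make the three association measures concrete, here is a small Python sketch that computes them from a 2x2 table of counts and checks the symmetry of the odds ratio numerically. The cell labels `a`, `b`, `c`, `d`, the function names, and the counts are hypothetical. The sketch treats the observed proportions as if they were the population probabilities $P(D \mid E)$ and $P(D \mid \bar{E})$, which is an assumption that holds only when the table covers the whole population or comes from a design that samples it without regard to disease status.

```python
import math


def association_measures(a, b, c, d):
    """ER, RR and OR for disease D given exposure E from a 2x2 table.

    a: exposed and diseased        b: exposed and not diseased
    c: unexposed and diseased      d: unexposed and not diseased
    """
    p_d_given_e = a / (a + b)      # P(D | E)
    p_d_given_not_e = c / (c + d)  # P(D | not E)
    excess_risk = p_d_given_e - p_d_given_not_e
    relative_risk = p_d_given_e / p_d_given_not_e
    odds_ratio = (p_d_given_e / (1 - p_d_given_e)) / (
        p_d_given_not_e / (1 - p_d_given_not_e))
    return excess_risk, relative_risk, odds_ratio


def exposure_odds_ratio(a, b, c, d):
    """Odds ratio with the roles of D and E swapped: odds of E given D status."""
    p_e_given_d = a / (a + c)      # P(E | D)
    p_e_given_not_d = b / (b + d)  # P(E | not D)
    return (p_e_given_d / (1 - p_e_given_d)) / (
        p_e_given_not_d / (1 - p_e_given_not_d))


# Hypothetical counts: 100 exposed subjects (30 diseased), 100 unexposed (10 diseased).
er, rr, odds_ratio = association_measures(30, 70, 10, 90)
print(er, rr, odds_ratio)

# Symmetry: both directions reduce to the cross-product ratio (a * d) / (b * c).
print(math.isclose(odds_ratio, exposure_odds_ratio(30, 70, 10, 90)))
```

In this made-up table the risks are 0.3 and 0.1, so ER is 0.2 and RR is 3, while the OR is $(30 \times 90)/(70 \times 10) \approx 3.86$. The symmetry is what allows the OR to be estimated from a case-control study, where only $P(E \mid D)$ and $P(E \mid \bar{D})$ can be observed.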